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Abstract 

De-novo reverse-engineering of genome-scale regulatory networks is a fundamental problem of biological and translational 
research. One of the major obstacles in developing and evaluating approaches for de-novo gene network reconstruction is 
the absence of high-quality genome-scale gold-standard networks of direct regulatory interactions. To establish a 
foundation for assessing the accuracy of de-novo gene network reverse-engineering, we constructed high-quality genome- 
scale gold-standard networks of direct regulatory interactions in Saccharomyces cerevisiae that incorporate binding and 
gene knockout data. Then we used 7 performance metrics to assess accuracy of 18 statistical association-based approaches 
for de-novo network reverse-engineering in 13 different datasets spanning over 4 data types. We found that most 
reconstructed networks had statistically significant accuracies. We also determined which statistical approaches and 
datasets/data types lead to networks with better reconstruction accuracies. While we found that de-novo reverse- 
engineering of the entire network is a challenging problem, it is possible to reconstruct sub-networks around some 
transcription factors with good accuracy. The latter transcription factors can be identified by assessing their connectivity in 
the inferred networks. Overall, this study provides the gene network reverse-engineering community with a rigorous 
assessment of the accuracy of 5. cerevisiae gene network reconstruction and variability in performance of various 
approaches for learning both the entire network and sub-networks around transcription factors. 
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Introduction 

One of the fundamental problems of modern biology is reverse- 
engineering of genome-scale regulatory networks. Addressing this 
problem is essential to expanding understanding of normal and 
pathologic cellular conditions and can lead to development of new 
drugs and therapies. While there are many databases that store 
biological pathways (e.g., KEGG and Ingenuity Pathway Analy- 
sis), these databases are often inaccurate and/or incomplete 
because their knowledge is derived from a multitude of biological 
systems and conditions that may not correspond to the problem at 
hand. Furthermore, pathways in these databases are affected by 
variability of the employed computational and experimental 
methods and their reproducibility characteristics [1-3]. Therefore, 
there is a strong need for reverse-engineering of genome-scale 
regulatory networks de novo from data. 

Gene regulatory networks can be constructed by integrating 
targeted perturbation data (e.g., gene knockouts or overexpression 
of transcription factors) with binding data (e.g., chromatin 
immunoprecipitation) (Figure 1). By knocking-out/deleting or 
over-expressing transcription factor X and comparing the 



expression level of other genes with the wild-type strain, one can 
determine regulatory targets of X. On the other hand, a binding 
assay allows identification of the binding targets of X. The overlap 
of regulatory targets and binding targets defines the set of direct 
regulatory targets of X which are graphically represented in gene 
regulatory networks. While modern methods in biology enable 
performing such studies in a variety of model systems, they are 
typically expensive to perform on a genome-scale and often 
unfeasible in humans. 

However, the wide-spread use of genomic profiling technologies 
over the last two decades led to development of thousands of 
observational, i.e. non-perturbation datasets (e.g., from case- 
control and case-series studies), that are freely available in public 
repositories such as GEO [4] and ArrayExpress [5]. In addition, 
the computational community has recendy provided many 
algorithms that can infer causal relations from non-perturbation 
data [6-10]; some of them have been adopted to accommodate 
the high dimensionalities of modern genomics data [11,12], and 
some methods even lead to Nobel awards in domains outside of 
biomedicine [13-16]. The question is whether these computa- 
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Figure 1. Construction of gene regulatory networks by integrating targeted perturbation data with binding data. The relations in 
constructed gene regulatory network correspond to direct regulatory interactions. 
doi:1 0.1 371 /journal.pone.01 06479.g001 



tional methodologies can accurately learn de-novo gene regulatory 
networks from highly abundant data in the public domain? 

Fortunately, this question has recently received attention in the 
scientific community [17-21]. However, the major obstacle in 
testing gene network reverse-engineering methods is the absence of 
high-quality genome-scale gold-standards of direct regulatory 
interactions that are derived by integrating targeted perturbation 
with binding data (see Table 1). Another problem is that 
currently the scientific community primarily uses perturbation 
data for gene network inference (many studies use compendium 
microarray data that is obtained by merging a large number of 
studies, predominantly with deletion mutants), while results based 
on observational data are more important, since the latter data is 
easier and cheaper to obtain. In general, it is unknown what types 
of datasets are more suitable for gene network reverse-engineering 
studies. 



To address gaps in prior research, this study focuses on S. 
cerevisiae, one of the most well-studied model organisms with a 
wide range of available genome-scale data. We first constructed 
high-quality genome-scale gold-standards of regulatory interaction 
and then assessed 18 statistical association-based approaches (from 
both bivariate analysis and multivariate causal graph-based 
methods) for de-novo network reverse-engineering in 1 3 different 
datasets that span over 4 data types: (i) observational data 
consisting of biological wild-type replicates, (ii) observational data 
obtained across time and/or environmental conditions, (iii) 
compendium (semi-perturbation) data, and (iv) perturbation data. 
This study uses de-novo methods based on statistical association 
[1 1,12,22-24] because they are state-of-the-art [19] and are most 
prevalent in the community. In the course of this study, the 
following four questions are addressed: First, how accurately can 
one infer genome-scale networks with statistical association-based 
de-novo methods? Second, which datasets/ data designs should be 



Table 1. Assessment of currently available genome-scale gold-standard networks used by prior gene network reverse-engineering 
studies. 





Gold-standard 


Description 


Limitations 


Used by 


#1 


E. Coli network from RegulonDB, a curated database of 
regulatory interactions obtained through literature search [50] 


• Unknown quality 


DREAM2 [17], DREAM5 
[18], [19], [21] 


• Heterogeneous data sources and 
experimental methods 


#2 


S. Cerevisiae network from binding data [51] 


• Binding relations can be non-functional [28] 


[20] 


• Higher quality binding data exists [33] 
and is utilized in gold-standard #3 


#3 


S. Cerevisiae network from binding data [33] 


• Binding relations can be non-functional [28] 


DREAM5 [18], [19], [21] 


#4 


S. Cerevisiae network from YEASTRACT, a curated database of 
regulatory interactions obtained through literature search [52,53] 


• Unknown quality 


DREAM5 [18] 


• Heterogeneous data sources and 
experimental methods 


#5 


S. Cerevisiae network from deletion mutants [54] 


• Inferred transcription relations can be indirect 


DREAM5 [18] 
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used for network inference? Third, which statistical methods lead 
to better accuracy? Fourth, is it possible to identify sub-networks in 
the entire network that can be reconstructed with high accuracy? 
To make conclusions of the study more useful to the community, 
results for 7 commonly used performance metrics are reported. 

Results 

Gold-standard gene regulatory networks integrate 
transcription factor-gene binding with perturbation 
(deletion mutants) data 

The analysis of targeted perturbation (deletion mutants) data 
described in the Methods section resulted in a network with 
991,444 regulatory relations involving 5,395 genes, including 118 
transcription factors (Spreadsheet SI). 

The analysis of binding data described in the Methods section 
resulted in the following three networks: Binding network $1 
(most conservative) involves 2,075 genes (including 1 14 transcrip- 
tion factors) and 4,034 binding relations. Binding network #2 
(intermediate) involves 3,113 genes (including 115 transcription 
factors) and 8,392 binding relations. Binding network ^3 (most 
liberal) involves 3,955 genes (including 116 transcription factors) 
and 13,050 binding relations. All identified binding interactions 
are provided in Spreadsheet S2. 

Integration of binding and perturbation data resulted in three 
gold-standard networks with direct regulatory interactions (Ta- 
ble 2). Identified direct regulatory interactions are listed in 
Spreadsheet S3. Figures 2 and 3 visualize the gold-standard 
network $1 for all genes and only transcription factors, 
respectively. Figure 4 presents a topological analysis of that 
gold-standard network. Similar data is provided for gold-standard 
networks #2 and #3 in Figures S1-S6. 

Assessment of the accuracy of network learning with 
sensitivity and specificity metrics 

The network reconstruction results presented below were 
obtained from the most conservative gold-standard network #1 
(Table 2). Results from the remaining two gold-standard 
networks are similar and are provided in Tables S4-S9. 

Table 3 provides values of sensitivity and specificity and 
Table 4 provides a combined sensitivity/ specificity Euclidean 
distance-based metric (see Methods) for 18 statistical approaches 
for reverse-engineering applied to 13 datasets, resulting in 234 
inferred networks (see Table SI for a colored version of Table 3 
and Table 4, where color denotes ranking of performances). The 
best result for combined sensitivity/ specificity metric ( = 0.64, 
corresponding to sensitivity = 0.52 and specificity = 0.58) is 
achieved in Hughes2 dataset by application of bivariate analysis 
with G2 test and 5% alpha threshold. The best 5% ranking results 
(see Table SI part B) according to the combined metric (12 
networks out of 234) correspond to bivariate analysis (10 networks) 
and GLL with conditioning on one gene (2 networks). In terms of 
datasets, 4 out of 1 2 best networks originate from Hughes 1 , 4 from 
Hughes2, 2 from GPL90, and 2 from Gasch. There is a large 
variability in accuracy of statistical approaches averaged over 1 3 
datasets, and the most accurate approaches are bivariate 
(combined metric =0.75-0.77 versus 0.85-0.98 for other meth- 
ods). The variability in accuracy of datasets averaged over 18 
statistical approaches is smaller, and the best results are achieved 
in Gresham (combined metric = 0.82), Smith (0.84), and 
Holstege4 (0.84) datasets (versus 0.85-0.89 for the remaining 
datasets). If we perform averaging over all statistical approaches 
and datasets belonging to the same data type, the best accuracy is 
achieved by observational data due to change in time/environ- 



ment and by compendium data (combined metric = 0.86), 
followed by perturbation data (0.87) and observational data 
consisting of biological wild-type replicates (0.88). 

Figure 5 provides an additional visualization of sensitivity/ 
specificity pairs for 1 8 statistical approaches x 1 3 datasets and the 
corresponding ROC curve [25,26] of the Pareto frontier [27]. The 
resulting area under ROC curve (AUROC) is 0.546 (p-value 
= 1.12xl0~ 7 ). Figure 6 shows ROC curves and reports AUROC 
for each data type separately. It follows that observational data 
consisting of biological wild-type replicates leads to least accurate 
networks with AUROC consistent with prediction by chance 
(AUROC = 0.499, p-value = 0.55). Other data types lead to small 
but statistically significant AUROC values, with the best result 
achieved by perturbation data (AUROC =0.541, p-value 
= 1.73xl0" 6 ), Mowed by compendium data (AUROC 
= 0.536, p-value =2.57x10 ) and observational data due to 
change in time /environment (AUROC = 0.521, p-value =0.01). 

Assessment of the accuracy of network learning with 
positive and negative predictive value metrics 

Table 5 provides values of positive predictive value (PPV) and 
negative predictive value (NPV) and Table 6 provides a 
combined PPV/NPV Euclidean distance-based metric (see Meth- 
ods) for 1 8 statistical approaches for reverse-engineering applied to 
13 datasets, resulting in 234 inferred networks (see Table S2 for a 
colored version of Table 5 and Table 6, where color denotes 
ranking of performances). The best result for combined PPV/NPV 
metric ( = 0.93, corresponding to PPV = 0.07 and NPV = 0.98) is 
achieved in the Smith dataset by application of GLL with a Z test, 
conditioning on 3 genes and using an AND rule. The best 5% 
ranking results (see Table S2 part B) according to the combined 
metric (17 networks out of 234) correspond to GLL with 
conditioning on either 2 or 3 genes. In terms of datasets, 5 out 
of 1 7 best networks originate from Yeung, 3 from Smith, 3 from 
Gasch, 3 from Hughes2, and the remaining 3 originate from 
M3D, GPL90, and Holstege4. There is a small variability in 
accuracy of statistical approaches averaged over 1 3 datasets, and 
the most accurate approach is GLL with Z test, conditioning on 3 
genes and using an AND rule (combined metric = 0.96 versus 
0.97-0.98 for other methods). The variability in accuracy of 
datasets averaged over 18 statistical approaches is even smaller, 
and the best results are achieved in Gasch, Smith, Yeung, and 
Hughes2 datasets (combined metric = 0.97 versus 0.98 for the 
remaining datasets). If we perform averaging over all statistical 
approaches and datasets belonging to the same data type, the best 
accuracy is achieved by observational data due to change in time/ 
environment (0.97), followed by other data types (0.98). 

Assessment of the accuracy of network learning with 
recall and precision metrics 

Table 7 provides values of recall (sensitivity) and precision 
(PPV) and Table 8 provides a combined recall/precision 
Euclidean distance-based metric (see Methods) for 18 statistical 
approaches for reverse-engineering applied to 13 datasets, 
resulting in 234 inferred networks (see Table S3 for a colored 
version of Table 7 and Table 8, where color denotes ranking of 
performances). The best results for combined recall/precision 
metric (= 0.99, corresponding to recall = 0.89-0.91 and precision 
= 0.02) are achieved in GPL90 and M3D datasets by application 
of bivariate analysis with G2 test. The best 5% ranking results (see 
Table S3 part B) according to the combined metric (17 networks 
out of 234) also correspond to bivariate analysis. In terms of 
datasets, 5 out of 1 7 best networks originate from GPL90, 3 from 
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Figure 2. Gold-standard gene regulatory network #1. Transcription factors are shown with large blue circles, and other genes are shown with 
small green circles. Edges in the network represent direct regulatory interactions. Inhibiting edges are shown with red, and excitatory edges are 
shown with black. 

doi:1 0.1 371 /journal.pone.01 06479.g002 



M3D, 3 from Yeung, 3 from Smith, and 3 from Holstege2. There 
is a large variability in accuracy of statistical approaches averaged 
over 13 datasets, and the most accurate approaches are bivariate 
(combined metric = 1.04—1.09 versus 1.27-1.38 for other meth- 
ods). The variability in accuracy of datasets averaged over 18 
statistical approaches is smaller, and the best results are achieved 
in GPL90 (combined metric =1.19), Smith (1.20), and Gresham 
(1.20) datasets (versus 1.21-1.31 for the remaining datasets). If we 
perform averaging over all statistical approaches and datasets 
belonging to the same data type, the best accuracy is achieved by 
compendium data (1.20), followed by observational data due to 
change in time/environment (1.23), observational data consisting 
of biological wild- type replicates (1.26), and perturbation data 
(1.27). 

Connectivity of transcription factors is correlated with 
the accuracy of learning their sub-networks 

Despite the overall low but statistically significant accuracies of 
gene network reverse-engineering in S. cerevisiae, some pathways 
or sub-networks can be learned with high accuracy from this data. 



For example, application of GLL method (with Fisher's Z-test and 
conditioning on one gene) to Yeung dataset allowed us to learn a 
sub-network of direct regulatory interactions of transcription factor 
GCN4 (containing 44 genes) with sensitivity = 0.50, specificity 
= 0.91, PPV = 0.24, NPV = 0.97, which is statistically significant 
after adjustment for multiple comparison (Figure S7). We 
hypothesize that total connectivity of transcription factors (assessed 
either in gold-standard or inferred networks) is correlated with the 
reconstruction accuracy of their sub-networks. If this hypothesis is 
true, the connectivity measure may be used to identify transcrip- 
tion factors whose sub-networks can be learned accurately by de 
novo reverse-engineering methods. 

The left panel of Figure 7 provides a scatter-plot showing 
significant correlation of transcription factor connectivity with the 
accuracy (combined PPV/NPV) of de novo reconstructing 
transcription factor sub-networks (that contain only direct 
regulatory interactions of each transcription factor). The right 
panel of Figure 7 shows the null distribution for assessing 
statistical significance of this correlation. Table 9 reports for 
each reverse-engineering approach and accuracy metric, the 
number of networks (in total we have 13 networks that were 
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Figure 3. Direct regulatory interactions between transcription factors in gold-standard gene regulatory network #1 . Inhibiting edges 
are shown with red, and excitatory edges are shown with black. 
doi:1 0.1 371 /journal.pone.01 06479.g003 

derived from 13 microarray gene expression datasets) with 
statistically significant correlation between connectivity of tran- 
scription factors and accuracy of reconstructing their sub- 
networks. As can be seen, for most reverse-engineering methods 
and accuracy metrics, connectivity of transcription factors in the 
inferred networks is significantly correlated with the reconstruction 
accuracy of their sub-networks. The correlations are sometimes 
robust and hold in multiple networks inferred from various 
datasets. However, the transcription factor connectivity assessed in 
the gold-standard networks correlates less robustly with the 
accuracy metrics; especially the combined sensitivity/ specificity 
is rarely correlated. Overall, the correlations are typically negative, 
which implies that reverse-engineering methods can achieve 
higher accuracy (using each of the three combined distance 
metrics) for transcription factors with larger connectivity (i.e., more 
direct regulatory interactions). This behavior is particularly 
interesting for the combined sensitivity/specificity metric which 
is not influenced by the density of the network. 

Methods and Materials 

Construction of the gold-standard networks of direct 
gene regulatory interactions 

The general process for construction of gold-standard networks 
with direct gene regulatory interactions is illustrated in Figure 1 . 
Two types of genome-scale data are required for network 
construction: (i) targeted perturbation data with gene knocks- 
outs /deletions or over-expressions that can be obtained by 



techniques for interference with RNA such as shRNA/ siRNA or 
inducible promoters, and (ii) binding data that can be obtained by 
chromatin immunoprecipitation (ChIP) methods such as ChlP- 
chip/ ChlP-seq. Targeted perturbation data allows identification of 
regulatory targets, while binding data allows identification of 
binding targets of transcription factors. Using either data alone is 
not sufficient to infer direct regulatory relations because regulatory 
interactions resulting from targeted perturbation data may be 
either direct or indirect, and likewise binding interactions can be 
either functional or not [28] . Therefore, we integrated regulatory 
and binding targets to obtain the set of direct regulatory targets 
which are graphically represented in gene regulatory networks. 

In the current study, we used targeted perturbation data 
obtained by a co-author of this study (P.K.). The targeted 
perturbation data was obtained from 1,484 gene deletion (mutant) 
experiments. Full details of experimental procedures, normaliza- 
tion procedures and statistical analyses are described in [29]. In 
summary, mutants from independent cultures were analyzed on 
dual-channel 70-mer oligonucleotide arrays using a batch of wild- 
type RNA as a common reference. In addition, wild-type profiles 
were obtained to statistically assess differences with mutant 
profiles. All gene expression profiles were normalized by loess 
method [30] followed by gene-specific dye-bias correction [31]. 
Differentially expressed genes between wild-type and mutant 
profiles were determined using limma [32] at 5% alpha level 
adjusted for multiple comparisons using the methodology of 
[23,24]. 
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Figure 4. Topological analysis of gold-standard gene regulatory network #1. The analysis was performed in Cytoscape with 
NetworkAnalyzer. 
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For the binding data, we used a previously published ChlP-chip 
dataset characterizing binding activity of 203 transcription factors 
to genes [33]. The original study [33] suggested using two 
thresholds (0.001 and 0.005) for assessing significance of binding 



interactions. To further filter false-positive binding relations, the 
study [33] suggested assessing evolutionary conservation of 
binding sequences in 0, 1, or 2 of the related Saccharomyces 
species. The primary approach used in the current study for 



Table 2. Overlapping identified binding with regulatory relations results in gold-standard networks with direct regulatory 
relations. 



Gold-standard network # 


Binding Network 




Regulatory 
network 


Gold-standard network (integrating binding 
and regulatory networks) 




Binding 
threshold 


Evolutionary 
conservation of 
binding sequences 
(# of species) 


# of edges 

(binding 

relations) 


# of edges 
(regulatory 
relations) 


# of edges 
(direct regulatory 
relations) 


Statistical significance of 
the overlap (p- value from 
hypergeometric test) 


1 


0.001 


2 


4,034 


991,444 


1,083 


<10~ 16 


2 


0.005 


1 


8,392 




1,785 


<10~ 16 


3 


0.005 


0 


13,050 




2,403 


<10~ 16 
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Figure 5. ROC curve of the Pareto frontier for sensitivity/ 
specificity pairs obtained by application of 18 network 
reverse-engineering approaches to 1 3 datasets. 

doi:1 0.1 371 /journal.pone.01 06479.g005 

identification of binding relations is based on the most conserva- 
tive analysis of the above ChlP-chip data with binding threshold 
= 0.001 and conservation in 2 species (resulting in "binding 
network #1"). In addition, we report in Supporting Information 
results for two other approaches: binding threshold = 0.005 and 
conservation in 1 species (resulting in "binding network #2") and 
binding threshold = 0.005 without conservation requirement 
(resulting in "binding network #3"). 

Finally, before the identified regulatory and binding relations 
were overlapped, all gene names were converted to systematic 
gene names using Saccharomyces Genome Database [34]. Any 
gene that has no mapping or ambiguous mapping to a systematic 
name was removed. This resulted in 5,395 common genes 
between targeted perturbation and binding data. 

Datasets for gene network reverse-engineering 

We obtained 13 datasets to be used for reverse-engineering of S. 
cerevisiae gene regulatory networks. Datasets and their character- 
istics are listed in Table 10. The datasets span over 4 data types: 

(i) observational data consisting of biological wild-type replicates, 

(ii) observational data obtained by changing time and/or 
environmental conditions, (iii) compendium (semi-perturbation) 
data, and (iv) perturbation data. Data types (i) and (ii) contain 
samples collected by passive observation of the system without 
specific interference on the levels of genes. Data type (iii) was 
obtained by merging data from a large number of studies available 
in major public microarray data repositories. Those studies were 
predominantly perturbations-based (with gene knock-outs/over- 
expressions), and therefore we refer to such compendium data as 
"semi-perturbation". Data type (iv) originates from gene knock- 
out/over-expression experiments. Out of 13 datasets used in the 
study, the following two are novel and are thus described in more 
detail below. 

Dataset Gresham was obtained by a co-author of this study 
(D.G.), and it describes the transcriptional response of 5,590 S. 



cerevisiae genes to dynamic changes in environmental nitrogen. 
Cells in nitrogen limited chemostats were treated with an excess of 
nitrogen, and the transcriptional response was assessed at different 
time intervals after the nitrogen treatment, resulting in 100 gene 
expression profiles [35]. 

Dataset GPL90 was compiled by using all microarray chips 
from Affymetrix Yeast Genome S98 Array available in GEO [4]. 
Specifically, 1,509 chips with raw data (CEL files) were 
downloaded from GEO on 08/21/2013. RMA normalization 
[36] was performed on all samples using Matlab function affyrma. 
Data for 39 out of 1,509 chips could not be processed and 
therefore discarded. The remaining data for 1,470 chips were 
processed as one batch. Affymetrix probe sets were mapped to 
gene names by a customized Matlab script using the platform 
annotation table for GPL90 (available on GEO) as reference. A 
total number of 6,740 genes over 1,470 samples were obtained 
upon completion of the process described above. The resulting 
dataset is provided in Spreadsheet S4. 

Statistical methods for gene network reverse- 
engineering 

This study uses de-novo statistical association-based approaches 
for network reverse-engineering [11,12,22—24] because they are 
state-of-the-art [19] and are most prevalent in the community. 
This is a very broad class of methods and it encompasses both 
traditional bivariate approaches (that consider only two genes/ 
variables at a time) and multivariate approaches (that perform 
conditioning based on other genes/variables). For the latter 
methods we use causal graph-based techniques from the Gener- 
alized Local Learning (GLL) algorithmic family [11,12]. Under 
fairly broad distributional assumptions, GLL provably discovers 
genes/variables that are direct causes and direct effects of the 
gene/variable of interest [11,12], and is known to be one of the 
best performing methods for de novo gene network reverse- 
engineering [19]. 

When we infer gene networks in this study, we follow the 
"divide-and-conquer" (also known as "local-to-global") approach 
whereby we first iteratively run each method to find direct 
upstream or downstream regulatory relations for each gene in the 
dataset, and then piece together the network. It may happen that 
the algorithm run on gene X may output that Y has a direct 
regulatory relation with X, however when the algorithm is run on 
gene Y, X does not belong to its output. We thus apply one of the 
two post-processing steps to piece together the network: (i) "AND" 
rule which implies that if the algorithm run on X outputs Y and if 
the algorithm run on Y outputs X, then X and Y have an edge in 
the resulting network, and (ii) "OR" rule which implies that if the 
algorithm run on X outputs Y or if the algorithm run on Y outputs 
X, then X and Y have an edge in the resulting network. 
Application of AND rule results in sparser networks, and OR rule 
results in denser networks. 

The list of 18 approaches for network reverse-engineering is 
given in Table 11. Methods are based on two statistical 
association tests: Fisher's Z [37] and G 2 [38] test. The latter test 
requires application to categorical data, and therefore we 
discretized gene expression data into ternary by standardizing it 
to mean 0 and standard deviation 1 and considering three 
categories: smaller than -1, between -1 and 1, and greater than 1. 

Finally, we note that all of the above approaches used in this 
study output undirected networks. Inference of directed networks 
from data remains a more challenging problem that is beyond the 
scope of the present study. 
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Figure 6. ROC curves of the Pareto frontier for sensitivity/specificity pairs obtained by application of 18 network reverse- 
engineering approaches to datasets of each type. 

doi:1 0.1 371 /journal.pone.01 06479.g006 



Metrics to assess accuracy of gene network reverse- 
engineering 

To assess accuracy of the network reverse-engineering, we used 
4 core and 3 combined performance metrics. The core metrics 
used are: positive predictive value (PPV, also known as precision), 
negative predictive value (NPV), sensitivity (also known as recall), 
and specificity. PPV measures the probability that a regulatory 
interaction discovered by the algorithm exists in the gold-standard 
(i.e., the precision of the output network), while NPV measures the 
probability that an interaction not predicted by the algorithm does 
not exist in the gold-standard. Sensitivity measures the proportion 
of interactions in the gold-standard that are discovered by the 
algorithm (i.e., the completeness of the output network), whereas 
specificity measures the proportion of interactions absent in the 
gold-standard that are not predicted by the algorithm. The value 
of core metrics ranges from 0 to 1, with larger values 
corresponding to a more accurate algorithm. 

Each of the three combined metrics was based on the two core 
antagonistic metrics and measured the Euclidean distance from 
the optimal algorithms with (PPV= 1, NPV= 1), (sensitivity = 1, 
specificity = 1), and (recall = 1, precision = 1): 



and 



(\-PPVy + (\-NPV) 2 , 
v/(l — sensitivity) 1 + (1 — specificity) 2 



(1— recall) 1 + (1— precision) 2 , respectively. These metrics take 
values between 0 and y2, where 0 denotes performance of the 
optimal algorithm and V2 denotes performance of the worst 
possible algorithm. A smaller value for either of these two metrics 
implies a more accurate algorithm. 

Statistical significance of the output networks was assessed using 
the hyper-geometric test at 5% alpha level adjusted for multiple 
comparisons using the methodology of [23,24]. The adjustment 
was performed over 3 (gold-standards) xl8 (methods) xl3 
(datasets) = 702 applications of network reverse-engineering 
algorithms. 

Assessing correlation between connectivity of 
transcription factors and the accuracy of learning their 
sub-networks 

For every transcription factor we measured its total connectivity 
(either in the inferred or gold-standard network) and accuracy of 
learning its sub-network measured by one of the three combined 
metrics mentioned in the previous subsection. Then we measured 
correlation using Spearman correlation coefficient and assessed 
significance of correlation using exact statistical test following the 
theory of Good [39]. The exact test is essential because 
transcription factors are not independent of each other. This test 



PLOS ONE | www.plosone.org 



10 



September 2014 | Volume 9 | Issue 9 | e1 06479 



De-Novo Learning of Genome-Scale Regulatory Networks in 5. cerevisiae 



o 
I 



in in in m m 



M Lfl CI IN IN IN IN 



IN IN IN 



0 
I 



CO CO CO CO CO 
Q\ Q\ Q\ &\ 



j oi oi 
°- ®- ®- 

IN 



COOOCOCOCOCOCOCOCO 



•9 M KT 



(N IN IN 



CO CO CO CO CO 



<xi 0* CO CO CO 



N CI M 



ttt ro ro ro ro m ro 



C\ Oi Oi 



INININININMINClfOINININMINMINfOIN 



0\ Oi Oi 



II 



INININININC)f)f>fOINININININMIN 



X V 

i I 



D Oi 



01 01 01 



00 00 00 



™ ™ n N 



W Oi ^ N a) ra 



E 

t 

c 
a 
E 



E 
in 



Oi 01 01 



cn m cn m m m 



WCNtNfNMrNMfNfOM 



> 



(N (N (N m m 



T3 



ro 

aj 
c 

T3 

c 

rc 

> 
Q_ 
Q_ 



o 



_ a 
o — 



o S 



COCOCOCOCOCOCOCOCOOO 



rsi rM cn oj 



IN IN IN IN IN 7TT fO 



CO CO CO 

CTi Oi CTi 



COCOCOCOCOCOCOCO 



rorNrNrNrNrsjrMrsj 



CO oo C» CO CO CO CO 



cn d 0"i 



Oi (Ji CTi 



"D 

QJ 



Q 
< 



< 



re 



o 
u 



cn cn cn cn cn cn 



NNNNISllSlMMNJ 



cn cn cn cn cn cn 
i— i— rsj rsi ro f 



<"\|<"NrNirM<N<"\irNfNfN 



Tm CO 



PLOS ONE | www.plosone.org 



11 



September 2014 | Volume 9 | Issue 9 | e1 06479 



De-Novo Learning of Genome-Scale Regulatory Networks in S. cerevisiae 



S < 



.11 

a. c 
x a 
w a 

I I 

JUL 



T3 



o 

ro 



Q. 

O 



ro 
T3 



IS 



_ a 
o — 



> 01 

% o 



8 s 











c 








3 


CO 


00 


CO 










>- 


d 


d 


d 



CO CO CO CO CO 

0"> CTi CXi CTi Ch 



W OJ 

dodo 



CO CO CO CO CO CO 

Ci (Ti Ci 0> O 



5> C Ol 



COCOCOCOCOCOCOOOOO 



CO CO CO 

CJ\ o\ 



Q\ Q\ Q\ 



CO CO CO 



Q\ Ol Ol 



5> Oi Oi 



CO CO CO 

Ch CTi CFi 



in vo ro \o 
cy» c\ oi oi 



G\ Oi Oi 



CO CO CO 
CTi Ol CTi 



O 

of 
Q 



CO CO CO 

CTi O 



N N 

aj a> 



CD U) U) tJ> cn cn 



N N N N N Nl M 





_0) 


















h 








cc 


3 


< 


p 


cc 


of 


DC 


O 


FD 


FD 



d W (J( 



CftCOCOCOCOCOCOCOCOCftOO 



r-vcocococococococo 



COCOC0CT^CriC^OiCT\0T» 
CJ\ CTi CTi 0"i 0"i 



Q 
< 



Dl Dl Oi O) Ol Ol 
i— rN r\l ro ro 



2 S 

s < 

< & 

I * 

a to 



S 2 



£ E 



3 



r-l rsi (N rM 



PLOS ONE | www.plosone.org 



12 



September 2014 | Volume 9 | Issue 9 | e1 06479 



De-Novo Learning of Genome-Scale Regulatory Networks in S. cerevisiae 



01 



cn cn cn m m 



MCNCNCNCNCNNCNCN 



0 
X 



w m in cn 



* * 



o 
I 



m is cn cn 



rN rN rN rN 



0_ 

IN 



1^ ^ ^ O 



rN rN rN 



m m ro 



M M M M fO fO 



P! >- vo vo 



cn cn cn cn 
o o o o 
dodo 
in co rs t- 
m m «* «- 



rn (N cn m m m 



M M M 

o o o 

o_ o_ o_ 

* CN * 

o o o 



< 
t: 



CNCNCNNCNcnclcncncNCNCNcnnlcncNcncN 



cn in N 
^ * in 
odd 



d 



X V 

i I 



ro M ro fi ro 



(NMlNfNCNMfNfOtN 



i- r- rs 



m m rN 



CO CO CO 



E 

t 

c 
m 
E 
c 
o 



cn in cn cn cn m 



E 
in 



i- o 

oo ■- 
d d 



cn cn cn cn cn 



NincNcNcNroNrorNmrN 
ooooooooooo 
d_o_d_d_d_d_d_d_o_o_o_ 
cnsti-ioo\oicNOicNiDf- 

OONrsrsi-CNi-CNt-CN 

ddddddddddd 
incncNcNcNcncncncncncn 



J2 
O 



in 10 10 



(N rM rN rN 



in *o m *o m i£> 



IN M N IN PI 



mrNrNrNrNrNrNrN 



_ a 
o — 



lO t- i- r- O <- 



O S 



VO VO VO 



IN 
O 



0) 

Q. 

T3 
C 

ra 

> 



Q 
z 
< 



a 
z 
< 



u 

0) 

rs 

IS 



0 

u 



W O D) Dl CTl Oi 



NNNNIS1IS1N1MNJ 



rji rji rji rji rji cj> 
i— i— rN rN ro f 



rNrNrNrNrNrNrNrNfN 



Tm CO 



PLOS ONE | www.plosone.org 



13 



September 2014 | Volume 9 | Issue 9 | e1 06479 



De-Novo Learning of Genome-Scale Regulatory Networks in S. cerevisiae 



S < 



.11 

a. c 
x a 
w a 

I I 

JUL 



> 

Q_ 
Q_ 

T3 
C 

ro 



O 



a. 

o 



ro 
T3 



00 
91 

-Q 

IS 



_ a 
o — 



> 01 

£ o 



M M M M P0 fO 



m in m 



t— t— r- m 





K5 
ro 


00 

m 


rs 










ro 


00 

ro 


CO 


CO 



ro fN M fN M fN fN 



m r\i m ro 



fO fO fO ro P0 P0 



1*1 PO 1*1 P0 fl 



ro ro ro ro ro 



PO PO PO M PO PO 



ro ro ro ro ro ro ro 



in fN po ro ro ro 



in fN po po po po 



fN t— i- 



(N rN IS « \0 vo 
po po po po po po 



rN fN ro ro 



ro ro ro ro fN 



* m *o m rN 
ro ro ro ro fN 



vo m in rs »- 



ro ro ro ro ro ro 



fN fN fN 



ro ro ro ro ro 



ro ro ro ro ro 



\Q \Q vQ CTi 



ro m ro ro 



cn co o\ 



po ^ ^ _ 



CO CO 

ro ro ro ro 



N N 
aj ai 



Dl Dl Ol Dl Ol Ol 



N N N N M rsl M 





_0) 


















h 








cc 


ZS 


< 


p 


CC 


of 


DC 


O 


FD 


FD 



Q 
< 



Dl Dl Oi CD Ol Ol 
i— rN rs ro ro 



a 

3 >• 
m ra 



S § 



£ E 



3 



rN fN (N fN 



fjijijijijijajoQQ 



PLOS ONE | www.plosone.org 



14 



September 2014 | Volume 9 | Issue 9 | e1 06479 



De-Novo Learning of Genome-Scale Regulatory Networks in S. cerevisiae 



Spearman Correlation Coef. = -0.942, p<0.001 




accuracy Combined PPV/NPV accuracy Correlation Coefficients 

Figure 7. Example scatter-plot of transcription factor connectivity versus the accuracy (combined PPV/NPV metric) of 
reconstructing their sub-networks. The left panel shows the scatter-plot and the right panel shows the null distribution for establishing 
statistical significance of the observed correlation. 
doi:1 0.1 371 /journal, pone.01 06479.g007 



Table 9. Number of networks that have significant correlations between transcription factor connectivity and accuracy of 
reconstructing their sub-networks. 



Gold-Standard Network Inferred Network 



Approach 


Sensitivity/ 
Specificity 


PPV/NPV 


Sensitivity /PPV 


Sensitivity/ 
Specificity 


PPV/NPV 


Sensitivity/ 
PPV 


BIVARIATE_Z_FDR_AND 


1 


1 


1 


8 


1 


10 


BIVARIATE_Z_FDR_OR 


1 


0 


1 


9 


1 


11 


BIVARIATE_Z_ALPHA 


1 


0 


2 


9 


1 


8 


GLL_Z_1_AND 


0 


3 


0 


6 


0 


2 


GLL_Z_1_OR 


0 


4 


0 


5 


0 


1 


GLL_Z_2_AND 


0 


5 


3 


13 


5 


1 


GLL_Z_2_OR 


0 


7 


0 


13 


5 


1 


GLL_Z_3_AND 


0 


0 


2 


13 


2 


0 


GLL_Z_3_OR 


0 


7 


1 


13 


3 


1 


BIVARIATE_G_FDR_AND 


0 


0 


1 


5 


3 


11 


BIVARIATE_G_FDR_OR 


0 


0 


1 


7 


3 


10 


BIVARIATE_G_ALPHA 


0 


1 


1 


7 


3 


9 


GLL_G_1_AND 


0 


1 


0 


4 


5 


10 


GLL_G_1_OR 


0 


2 


0 


4 


1 


8 


GLL_G_2_AND 


0 


2 


0 


6 


6 


8 


GLL_G_2_OR 


0 


3 


0 


5 


1 


5 


GLL_G_3_AND 


0 


2 


0 


7 


5 


8 


GLL_G_3_OR 


0 


3 


0 


5 


1 


5 



The correlations were assessed for 13 different networks (derived from 13 gene expression microarray datasets) for each combination of network reverse-engineering 
approaches and combined accuracy metrics. Statistical significance is assessed at 5% alpha level adjusted globally for multiple comparisons (over all statistical tests 
performed for the table). The left portion of the table corresponds to transcription factor connectivity assessed in the gold-standard network, and the right portion 
corresponds to transcription factor connectivity assessed in the inferred network. 
doi:10.1371/journal.pone.0106479.t009 
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Table 10. Datasets used for gene regulatory network reverse-engineering. 



Dataset type 


Dataset 
name 


Sample 
size 


Number 
of genes 


Description 


Source 


Reference 


Observational 

(Biological Wild-Type 
Replicates) 


Holstegel 


200 


6,170 


A collection of wild-type 
S. cerevisiae strain 
transcriptional profiles 


ArrayExpress 


[29] 


E-TABM-773 




Holstege2 


200 


6,170 


A collection of wild-type 
S. cerevisiae strain 
transcriptional profiles 


ArrayExpress 


[29] 


E-TABM-984 


Observational 

(Environment/Time) 


Gresham 


100 


5,590 


Environmental change induced 
transcription response in 
S. cerevisiae 


(to be submitted to GEO) 


[35] 




Gasch 


173 


6,152 


Environmental change induced 
transcription response in 
S. cerevisiae 


http://genome-www.stanford.edu/yeast_stress/data. 
shtml Accessed 8/20/2014 


[55] 




Smith 


220 


6,257 


Environmental change induced 
transcription response in 
5. cerevisiae 


GEO 


[56] 


GSE9376 




Yeung 


582 


5,717 


Time-dependent response to 
rapamycin in S. cerevisiae 


ArrayExpress 


[57] 


E-MTAB-412 


Semi-Perturbation 

(Compendium) 


M3D 


530 


5,520 


Compendium dataset for 5. 
cerevisiae 


Many Microbe Microarrays Database (M3D) 


[58] 


http://m3d.mssm.edu/ Accessed 8/20/2014 




GPL90 


1,470 


6,740 


Compendium dataset for 5. 
cerevisiae utilizing all samples 
from GPL90 platform (Affymetrix 
Yeast Genome S98 Array) 


GEO 


Constructed 
for this study 


GPL90 


Perturbation 


Hughesl 


291 


6,307 


Transcriptional response in 
S. cerevisiae induced by 
promoter-shutoff strains 


GEO 


[59] 


GSE1404 




Hughes2 


291 


6,307 


Transcriptional response in S. 
cerevisiae induced by transcription 
factor overexpression and deletion 


GEO 


[60] 


GSE5499 




Hu 


269 


6,429 


Transcriptional responses in 
S. cerevisiae induced by 
transcription factor deletion 


GEO 


[54] 


GSE4654 




Holstege3 


464 


6,170 


Transcriptional response in 
S. cerevisiae induced by protein 
kinases and phosphatases deletion 


ArrayExpress 


[61] 


E-TABM-907 




Holstege4 


319 


6,170 


Transcriptional response in S. 
cerevisiae induced by non-essential 
knockouts of chromatin modifiers 


ArrayExpress 


[62] 


E-TABM-1074 



doi:1 0.1 371 /journal.pone.01 06479.t01 0 



involved 1,000 permutations of gene identifiers for a fixed network 
structure and establishing a null distribution for Spearman 
correlation coefficients. The p-value was computed as proportion 
of permuted networks where correlation was higher in magnitude 
than the observed one. When we evaluated correlation between 
connectivity and accuracy for multiple networks and accuracy 
metrics, statistical significance was assessed at 5% alpha level 



adjusted for multiple comparisons using the methodology of 
[23,24]. 

Topological analysis and visualization of gene regulatory 
networks 

The topological analysis of gene regulatory networks was 
performed in Cytoscape software platform [40] (http:/ /www. 



PLOS ONE | www.plosone.org 



16 



September 2014 | Volume 9 | Issue 9 | e1 06479 



De-Novo Learning of Genome-Scale Regulatory Networks in S. cerevisiae 



Table 11. Statistical approaches used for gene regulatory network reverse-engineering. 



Approach # 


Algorithm 




Statistic 


Conditioning 


Post-Processing 


Abbreviation 


1 


Bivariate analysis 




Fisher's Z 


None 


FDR, AND rule 


BIVARIATE_Z_FDR_AND 


2 


Bivariate analysis 




Fisher's Z 


None 


FDR, OR rule 


BIVARIATE_Z_FDR_OR 


3 


Bivariate analysis 




Fisher's Z 


None 


Alpha 


BIVARIATE_Z_ALPHA 


4 


Multivariate causal graph-based 


(GLL) 


Fisher's Z 


1 


AND rule 


GLL_Z_1_AND 


5 


Multivariate causal graph-based 


(GLL) 


Fisher's Z 


1 


OR rule 


GLL_Z_1_OR 


6 


Multivariate causal graph-based 


(GLL) 


Fisher's Z 


2 


AND rule 


GLL_Z_2_AND 


7 


Multivariate causal graph-based 


(GLL) 


Fisher's Z 


2 


OR rule 


GLL_Z_2_OR 


8 


Multivariate causal graph-based 


(GLL) 


Fisher's Z 


3 


AND rule 


GLL_Z_3_AND 


9 


Multivariate causal graph-based 


(GLL) 


Fisher's Z 


3 


OR rule 


GLL_Z_3_OR 


10 


Bivariate analysis 




G 2 


None 


FDR, AND rule 


BIVARIATE_G_FDR_AND 


11 


Bivariate analysis 




G 2 


None 


FDR, OR rule 


BIVARIATE_G_FDR_OR 


12 


Bivariate analysis 




G 


None 


Alpha 


BIVARIATE_G_ALPHA 


13 


Multivariate causal graph-based 


(GLL) 


G 2 


1 


AND rule 


GLL_G_1_AND 


14 


Multivariate causal graph-based 


(GLL) 


G 2 


1 


OR rule 


GLL_G_1_OR 


15 


Multivariate causal graph-based 


(GLL) 


G 2 


2 


AND rule 


GLL_G_2_AND 


16 


Multivariate causal graph-based 


(GLL) 


G 2 


2 


OR rule 


GLL_G_2_OR 


17 


Multivariate causal graph-based 


(GLL) 


G 2 


3 


AND rule 


GLL_G_3_AND 


18 


Multivariate causal graph-based 


(GLL) 


G 2 


3 


OR rule 


GLL_G_3_OR 



"FDR" refers to thresholding associations at 5% FDR using the methodology of [23,24]. "Alpha" refers to thresholding associations at 5% alpha. "AND" rule implies that if 
the algorithm run on X outputs Y and if the algorithm run on Y outputs X, then X and Y have an edge in the resulting network, and (ii) "OR" rule implies that if the 
algorithm run on X outputs Y or if the algorithm run on Y outputs X, then X and Y have an edge in the resulting network. 
doi:1 0.1 371 /journal.pone.01 06479.t01 1 



cytoscape.org/) using NetAnalyzer plugin [41] (http://med.bioinf. 
mpi-inf.mpg.de/netanalyzer/). Detailed definitions and meaning 
of topological network parameters are given in [42]. Network 
visualization was performed using yED graph editor [43] (http:// 
www.yworks.com/). 

Discussion 

Comparison with prior results 

The results of the current study indicate that gene network 
reverse-engineering in S. cerevisiae is a challenging problem. 
Given prior work in the field, it is interesting to compare current 
results with the prior studies in 5. cerevisiae, while keeping in mind 
that prior studies used less comprehensive gold-standard networks 
(see Introduction and Table 1). Furthermore, the majority of 
prior work deals only with inferring likelihood scores of all possible 
network edges without establishing a threshold on these scores 
which would result in a discrete network [18,2 1] . The latter studies 
do not report accuracy metrics of gene network reverse-engineer- 
ing but typically report metrics related to ranking all possible 
network edges by the inferred likelihood scores. To the best of our 
knowledge, there are only two studies which inferred discrete 
genome-scale networks in S. cerevisiae. The study [20] applied two 
statistical methods, resulting in non-statistically significant net- 
works, both with PPV = 0. The study [19] used 6 versions of S. 
cerevisiae binding data-based gold-standard and applied 30 
approaches (many of which were not included in the current 
study) to learn a network. As can be seen in Table S7, results of 
the current study are much better in terms of sensitivity and 
specificity and related combined metric. However, in terms of 
PPV, NPV, and related combined metric, results are slightly worse 
(by 0.01 PPV). 



While this study focuses on genome-scale regulatory network 
reverse-engineering in S. cerevisiae, there was significant prior 
work in other model systems/organisms, e.g. E. coli [17-19,21]. 
Interestingly, inference of E. Coli networks seems to be an easier 
problem than inference of S. cerevisiae networks. For example, the 
best known result in terms of combined PPV/NPV metric for S. 
cerevisiae is 0.92 (PPV = 0.08 and NPV = 0.98) but for E. Coli it is 
0.36 (PPV = 0.64, NPV = 0.98) [19]. The results in terms of 
combined sensitivity/specific metric for S. cerevisiae are also worse 
than for E. Coli [19]. Others have also made similar observation 
for additional metrics [18]. It remains to be seen whether the 
difference in accuracy of learning S. cerevisiae and E. coli 
networks is due to the nature of transcription factor regulation, 
network complexity, quality of gold-standard networks, quality of 
datasets used for network learning, or combination of these factors. 

Towards improving accuracy of gene network reverse- 
engineering 

While there are theoretical challenges of network reverse- 
engineering from microarray data, e.g. impact of cellular 
aggregation on inference of statistical relations [44], we believe 
that there are several ways to improve the accuracy of learning 
gene regulatory networks. First, by further improving the quality 
and completeness of gold-standard networks. For example, one 
can improve networks obtained with current approaches by 
ensuring that all transcription factors participate in both binding 
and gene knockout data and by using a large number of biological 
replicates for gene knockouts. The binding data can be further 
improved by using ChlP-seq and inclusion of other indications of 
bindings, for example protein binding microarrays. Another 
possibility worth exploring is using protein-protein interaction 
data in addition to binding data which would allow enriching the 
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gold-standard networks that are currently based only on 
transcription factor-gene interactions. Second, by performing 
inference of gene networks from both observational and pertur- 
bation data with explicit knowledge of gene manipulations (current 
methods were not provided with information about targeted 
perturbations in the data). The latter methods (e.g., [45-49]) have 
promise because they allow to solve the theoretical problem of 
statistical indistinguishability of networks learned from observa- 
tional data alone [6]. 

More on interpretation and analysis of obtained results 

We used 4 widespread core performance metrics (sensitivity or 
recall, specificity, PPV or precision, and NPV) and 3 ways to 
combine them by equally weighting two antagonistic core 
performance metrics at a time (sensitivity and specificity, PPV 
and NPV, and recall and precision). Given that most methods 
output sparse graphs and the underlying gold-standard networks 
are also sparse, the combined sensitivity/specificity metric is 
significandy influenced by sensitivity (because many networks have 
specificity £0.90), and in particular combined PPV/NPV metric is 
largely influenced by PPV (because all networks but one have 
NPV^0.98). Combined recall/precision metric also suffers from 
similar issue since it is mostly influenced by sensitivity (because 
most methods have very low PPV^O.05). The interpretation of 
results and relevance to specific biological problems can be 
improved by using other combinations of core performance 
metrics (e.g., by using unequal weighting of PPV and NPV metrics 
in the Euclidean-based combined distance metric) or by devising 
new performance metrics. To facilitate the latter task, we are 
providing in Spreadsheet S5 detailed results with the numbers 
of true positive, true negative, false positive, and false negative 
edges computed for each network. 

Conclusions 

This study has two key contributions. First, we constructed high- 
quality genome-scale gold-standards of direct regulatory interac- 
tions in S. cerevisiae that incorporate binding and gene knockout 
data. Second, we used 7 performance metrics to assess accuracy of 
18 statistical association-based approaches for de-novo network 
reverse-engineering in 13 different real datasets spanning over 4 
data types (observational data consisting of biological wild-type 
replicates, observational data obtained by changing time and/or 
environmental conditions, compendium/semi-perturbation data, 
and perturbation data). We found that inference of genome-scale 
regulatory networks in S. cerevisiae is a challenging problem and 
quantified resulting accuracies, most of which are statistically 
significant (see Table S10). We also found significant variability of 
the network reverse-engineering accuracy among statistical 
approaches for network inference. When accuracy is assessed 
based on sensitivity/specificity or recall/precision combined 
metrics, bivariate analysis is the best approach, and when accuracy 
is assessed based on PPV/NPV combined metric, Generalized 
Local Learning (GLL) with conditioning on 2-3 genes is the best 
approach. On the other hand, the variability of the network 
reverse-engineering accuracy is much smaller among various 
datasets and data types compared to variability among statistical 
approaches. However, some datasets/ data types tend to dominate 
others for specific performance metrics, and in most cases using 
observational data consisting of biological wild-type replicates 
leads to worse accuracies compared with other datasets and data 
types. This indicates that considering that cost efficiency of various 
data types, observational data with changes in environments/time 
is preferable for network reconstruction. Finally, we found that for 



most reverse-engineering methods and accuracy metrics, connec- 
tivity of transcription factors is often significandy correlated with the 
reconstruction accuracy of their sub-networks. The correlations are 
sometimes robust and significant in multiple networks inferred from 
various datasets. Therefore, the connectivity measure may be used 
to identify transcription factors whose sub-networks can be learned 
accurately by de-novo reverse-engineering methods. We believe 
that the gene network reverse-engineering community will find this 
study useful in order to have a realistic perspective on this problem 
and performance of a variety of approaches. 

Supporting Information 

Figure SI Gold-standard gene regulatory network #2. 

(PDF) 

Figure S2 Direct regulatory interactions between tran- 
scription factors in gold-standard gene regulatory 
network #2. 

(PDF) 

Figure S3 Topological analysis of gold-standard gene 
regulatory network #2. 

(PDF) 

Figure S4 Gold-standard gene regulatory network #3. 

(PDF) 

Figure S5 Direct regulatory interactions between tran- 
scription factors in gold-standard gene regulatory 
network #3. 

(PDF) 

Figure S6 Topological analysis of gold-standard gene 
regulatory network #3. 

(PDF) 

Figure S7 De-novo reconstruction of the GCN4 sub- 
network. 

(PDF) 

Table SI Gold-standard network #1, sensitivity and 
specificity (panel A) and Euclidean distance from the 
optimal algorithm with sensitivity = 1 and specificity 
= 1 (panel B). 

(DOCX) 

Table S2 Gold-standard network #1, positive predic- 
tive value (PPV) and negative predictive value (NPV) 
(panel A) and Euclidean distance from the optimal 
algorithm with PPV = 1 and NPV = 1 (panel B). 

(DOCX) 

Table S3 Gold-standard network #1, recall (sensitivity) 
and precision (PPV) (panel A) and Euclidean distance 
from the optimal algorithm with recall = 1 and preci- 
sion = 1 (panel B) . 

(DOCX) 

Table S4 Gold-standard network #2, sensitivity and 
specificity (panel A) and Euclidean distance from the 
optimal algorithm with sensitivity = 1 and specificity 
= 1 (panel B). 

(DOCX) 

Table S5 Gold-standard network #2, positive predic- 
tive value (PPV) and negative predictive value (NPV) 
(panel A) and Euclidean distance from the optimal 
algorithm with PPV = 1 and NPV = 1 (panel B). 

(DOCX) 
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Table S6 Gold-standard network #2, recall (sensitivity) 
and precision (PPV) (panel A) and Euclidean distance 
from the optimal algorithm with recall = 1 and preci- 
sion = 1 (panel B) . 

(DOCX) 

Table S7 Gold-standard network #3, sensitivity and 
specificity (panel A) and Euclidean distance from the 
optimal algorithm with sensitivity = 1 and specificity 
= 1 (panel B). 

(DOCX) 

Table S8 Gold-standard network #3, positive predic- 
tive value (PPV) and negative predictive value (NPV) 
(panel A) and Euclidean distance from the optimal 
algorithm with PPV = 1 and NPV = 1 (panel B). 

(DOCX) 

Table S9 Gold-standard network #3, recall (sensitivity) 
and precision (PPV) (panel A) and Euclidean distance 
from the optimal algorithm with recall = 1 and preci- 
sion = 1 (panel B) . 

(DOCX) 

Table S10 Comparison of accuracy of gene network 
reverse-engineering with the prior study. 

(DOCX) 

Spreadsheet SI Regulatory Network. 

(XLSX) 
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